Toward Smaller, Faster, and Better Hierarchical Phrase-based SMT
نویسندگان
چکیده
We investigate the use of Fisher’s exact significance test for pruning the translation table of a hierarchical phrase-based statistical machine translation system. In addition to the significance values computed by Fisher’s exact test, we introduce compositional properties to classify phrase pairs of same significance values. We also examine the impact of using significance values as a feature in translation models. Experimental results show that 1% to 2% BLEU improvements can be achieved along with substantial model size reduction in an Iraqi/English two-way translation task.
منابع مشابه
A Topic Similarity Model for Hierarchical Phrase-based Translation
Previous work using topic model for statistical machine translation (SMT) explore topic information at the word level. However, SMT has been advanced from word-based paradigm to phrase/rule-based paradigm. We therefore propose a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation. We associate each synchronous rule with a t...
متن کاملHierarchical phrase-based translation with weighted finite state transducers
This dissertation is focused in the Statistical Machine Translation field (SMT), particularly in hierarchical phrase-based translation frameworks. We first study and redesign hierarchical models using several filtering techniques. Hierarchical search spaces are based on automatically extracted translation rules. As originally defined they are too big to handle directly without filtering. In thi...
متن کاملA Joint Rule Selection Model for Hierarchical Phrase-Based Translation
In hierarchical phrase-based SMT systems, statistical models are integrated to guide the hierarchical rule selection for better translation performance. Previous work mainly focused on the selection of either the source side of a hierarchical rule or the target side of a hierarchical rule rather than considering both of them simultaneously. This paper presents a joint model to predict the selec...
متن کاملHierarchical Phrase-Based Statistical Machine Translation System
The aim of this thesis is to express fundamentals and concepts behind one of the emerging techniques in statistical machine translation (SMT) hierarchical phrase based MT by implementing translation from Hindi to English. Basically hierarchical model extends phrase based models by considering subphrases with the aid of context free grammar (CFG). In other models, syntax based models bear a rese...
متن کاملAutomated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation
We introduce a novel technique that uses hierarchical phrase-based statistical machine translation (SMT) for grammar correction. SMT systems provide a uniform platform for any sequence transformation task. Thus grammar correction can be considered a translation problem from incorrect text to correct text. Over the years, grammar correction data in the electronic form (i.e., parallel corpora of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009